Characterising paediatric mortality during and after acute illness in Sub-Saharan Africa and South Asia: a secondary analysis of the CHAIN cohort using a machine learning approach

Summary
Background A better understanding of which children are likely to die during acute illness will help clinicians and policy makers target resources at the most vulnerable children. We used machine learning to characterise mortality in the 30 days following admission and the 180 days after discharge from nine hospitals in low- and middle-income countries (LMIC).
Methods A cohort of 3101 children aged 2–24 months were recruited at admission to hospital for any acute illness in Bangladesh (Dhaka and Matlab Hospitals), Pakistan (Civil Hospital Karachi), Kenya (Kilifi, Mbagathi, and Migori Hospitals), Uganda (Mulago Hospital), Malawi (Queen Elizabeth Central Hospital), and Burkina Faso (Banfora Hospital) from November 2016 to January 2019. To record mortality, children were observed during their hospitalisation and for 180 days post-discharge. Extreme gradient boosted models of death within 30 days of admission and mortality in the 180 days following discharge were built. Clusters of mortality sharing similar characteristics were identified from the models using Shapley additive explanation (SHAP) values with spectral clustering.
Findings Anthropometric and laboratory parameters were the most influential predictors of both 30-day and post-discharge mortality. No WHO/IMCI syndromes were among the 25 most influential predictors of mortality. For 30-day mortality, two lower-risk clusters (N = 1915, 61%) included children with higher-than-average anthropometry (1% died, 95% CI: 0–2%), and children without signs of severe illness (3% died, 95% CI: 2–4%). The two highest risk 30-day mortality clusters (N = 118, 4%) were characterised by high urea and creatinine (70% died, 95% CI: 62–82%); and nutritional oedema with low platelets and reduced consciousness (97% died, 95% CI: 92–100%). For post-discharge mortality risk, two low-risk clusters (N = 1753, 61%) were defined by higher-than-average anthropometry (0% died, 95% CI: 0–1%), and gastroenteritis with lower-than-average anthropometry and without major laboratory abnormalities (0% died, 95% CI: 0–1%). Two highest risk post-discharge clusters (N = 267, 9%) included children leaving against medical advice (30% died, 95% CI: 25–37%), and severely-low anthropometry with signs of illness at discharge (46% died, 95% CI: 34–62%).
Interpretation WHO clinical syndromes are not sufficient for predicting risk. Integrating basic laboratory features such as urea, creatinine, red blood cell, lymphocyte and platelet counts into guidelines may strengthen efforts to identify high-risk children during paediatric hospitalisations.
Funding Bill & Melinda Gates Foundation (OPP1131320).


Introduction
Despite reductions in under-five mortality, the vast majority of child deaths still occur in low- and middle-income countries (LMIC). 1 Children in these settings have a high risk of death during inpatient admission and for six to twelve months following discharge. 2,3 The Childhood Acute Illness and Nutrition (CHAIN) Cohort is a study of children admitted to urban and rural hospitals in sub-Saharan Africa and south Asia. 4,5 The cohort's primary analysis used survival analysis and structural equation modelling to estimate causal pathways to death based on the UNICEF framework for child mortality. 5 That analysis demonstrated that the proportion of deaths in the six months following hospitalisation approximated inpatient deaths, that anthropometry captures a broad range of pathways to mortality, and that caregiver employment and mental health were directly associated with post-discharge mortality.
Rigorous causal modelling is an ideal approach to test investigator-driven hypotheses, but it may overlook important risk phenotypes not previously identified in the literature or the model's conceptual framework. Machine learning (ML) minimises the number of analytic decisions based on expert opinion or previous literature, but has traditionally focused on predicting outcomes rather than characterising risks. 6 However, model explanation tools have made it possible to form clusters of patients whom the model predicts to be at similar risk of an outcome due to shared underlying characteristics. These explainable machine learning approaches have been used to identify novel risk phenotypes in adult intensive care units, chronic obstructive pulmonary disorders, severe influenza admissions, and breast cancer, 7-10 but there are few examples of these methods being applied to paediatric care in LMICs. 11-13 We aimed to describe phenotypes associated with mortality and survival based on the clinical, anthropometric, laboratory, and sociodemographic features using ML techniques within the CHAIN cohort.

Study design and participants
A description of the CHAIN Cohort has been published previously. 4 We recruited children aged 2-23 months at admission to hospitals in Dhaka and Matlab (Bangladesh); Karachi (Pakistan); Kilifi, Nairobi and Migori (Kenya); Kampala (Uganda); Blantyre (Malawi); and Banfora (Burkina Faso) from November 2016 to January 2019. Children with traumatic injuries or conditions requiring surgery in the next six months were excluded. Each site oversampled children at high risk of mortality by stratifying enrolment by mid-upper arm circumference (MUAC) in the following ratio: two children with severe wasting or nutritional oedema (MUAC <11.5 cm if ≥6 months of age, MUAC <11.0 cm if <6 months of age, or bilateral pitting oedema), two with moderate wasting (MUAC <12.5 cm but ≥11.5 cm if ≥6 months of age, MUAC <12.0 cm but ≥11.0 cm if <6 months of age), and one with no wasting per week. Detailed clinical, anthropometric and sociodemographic information was collected at admission and discharge. Bedside malaria testing (CARESTART or SD Bioline Ag Pf-Pan) and HIV-1/2 testing (Alere 2 Determine or Unigold) were performed at admission. Complete blood counts and biochemical analysis were performed on admission and discharge samples by local clinical laboratories. Fieldworkers also visited participants' homes to collect global positioning system coordinates, which were merged with geospatial datasets to calculate distance to the nearest health facility and to the admitting facility, and the population density in the square kilometre surrounding the household. 14 Outcomes, including mortality, were ascertained from enrolment to 180 days after discharge from the index hospitalisation. Ethical approval for this study was obtained from the University of Oxford and all relevant site-specific committees. 4,5 Informed consent was given by the caregivers of all participants, and this manuscript was written in compliance with the STROBE and TRIPOD reporting guidelines.

Research in context

Evidence before this study
Despite implementation of WHO guidelines, recently published systematic reviews have shown that some groups of children in sub-Saharan Africa and Asia experience inpatient mortality rates of 15-25%, with similar subgroups of high-risk children suffering up to 20% mortality in the six months following hospital discharge. To understand the factors that identify these high-risk groups across different settings and disease-syndromes, we searched PubMed on February 2nd 2022 using the terms ("paediatric hospitalisation" or "paediatric post-discharge") and "mortality" for articles published in English, French or Spanish, assessing the causes and predictors of acute and post-discharge paediatric mortality during acute illnesses in sub-Saharan Africa and Asia. Fifty of 137 studies identified contained relevant data, but no studies explored risk factors across multiple settings and different disease-syndrome categories. Additionally, very few studies employed machine learning techniques, which can reveal previously unseen or underappreciated patterns of mortality.

Added value of this study
We used machine learning to identify clusters of mortality and found the most influential predictors of mortality in the 30-days following admission to hospital, and the 180 days after discharge, were anthropometric measurements and routine haematological and biochemical tests. The three highest risk clusters derived from the 30-day model were defined by nutritional oedema without signs of sepsis or severe illness, high serum urea and creatinine, and nutritional oedema with signs of sepsis or severe illness. Among children who were discharged, the highest risk clusters were defined by longer than average length of stay, leaving against medical advice, and severely poor anthropometric status at discharge. No WHO/IMCI clinical syndromes were among the 25 most influential predictors of mortality in either model.

Implications of all the available evidence
Most children admitted to hospital in this study survived when managed with current guidelines. However, subgroups of children with severe wasting, nutritional oedema, signs of sepsis, or evidence of renal insufficiency experienced extremely high rates of mortality. Developing novel interventions to treat these groups is a potentially important alternative to continued investment in incremental changes to the current WHO syndromic management approach. Consistent access to biochemical and haematological testing may strengthen efforts to identify high-risk children during paediatric hospitalisations.

Statistical analysis
Data from all children enrolled in the CHAIN cohort were included in this analysis. We analysed mortality during two time periods, reflecting the primary epidemiological analysis of CHAIN. 5 The first analysis assessed mortality in the 30 days following admission to hospital; the second evaluated mortality in the six months following index hospital discharge.
Potential predictors of 30-day mortality included 207 admission variables (7 demographic, 184 clinical, 16 laboratory; Appendix 1). Clinical variables included World Health Organisation/Integrated Management of Childhood Illness (WHO/IMCI) syndromic diagnoses for diarrhoea, malaria, and pneumonia. Z-scores for weight-for-age (WAZ), height-for-age (HAZ), and weight-for-height (WHZ) were calculated using the WHO AnthroPlus package. Many social variables were collected from caregivers after the child was medically stable, typically 48 h after admission. Consequently, social variables were not included in the 30-day model, because they were missing for children who died early and extreme gradient boosted models (XGBoost) use missing data as a potential predictor. The post-discharge model included 556 admission and discharge variables (11 demographic, 320 clinical, 39 laboratory and 186 socioeconomic; Appendix 1).
Data for both analyses were randomly split into training (90%) and test (10%) sets. XGBoost Cox proportional hazards models were fit in the training sets with ten-fold cross-validation (Fig. 1). 15,16 This model allows for participant censoring while also leveraging XGBoost's highly flexible ensemble approach. 7 To account for the stratified recruitment, inverse-probability weights for selection were included (Appendix 2). Model performance was measured on the test sets using the C-statistic, which approximates the area under the curve (AUC). This process was repeated ten times, resulting in ten models and ten test set performances. The ensemble of these ten XGBoost models is hereafter referred to as the final XGBoost model.
To interpret the XGBoost model, and expose the contribution of each variable to each child's predicted risk, we calculated SHapley Additive exPlanations (SHAP) values. 17 The 25 variables that provided the most information to the model were graphed. Variables outside the top 25 were found to make inconsequential contributions to the model. To understand if a simpler model would have similar predictive performance, we repeated the XGBoost estimation of the C-statistic using a decreasing number of predictors, beginning with the most influential 50 predictors, and iteratively removing the least informative remaining variable, re-estimating the C-statistic, and repeating this process until only one variable remained.
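As an illustration of the performance measure used above, the following minimal sketch computes Harrell's C-statistic for right-censored data in plain Python. It is not the study's code: the function name, the toy data, and the decision to skip pairs with tied follow-up times are assumptions made for this example. A pair of children is comparable only when the one with the shorter follow-up time is known to have died; the C-statistic is the fraction of comparable pairs in which that child also had the higher predicted risk.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-statistic for right-censored survival data.

    times  : follow-up time for each child (e.g. days)
    events : 1 if the child died at times[i], 0 if censored
    risks  : predicted risk score (higher means earlier death expected)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        if times[a] == times[b]:
            continue  # assumption: pairs with tied times are skipped
        if not events[a]:
            continue  # shorter time is censored, order of death unknown
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): two deaths, two censored children.
toy_times = [2, 5, 9, 30]
toy_events = [1, 1, 0, 0]
toy_risks = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))  # 0.8
```

In the toy data, five pairs are comparable and four are ranked correctly, giving 0.8. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.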
Using the SHAP values derived from the final XGBoost model, we clustered children at a similar predicted mortality risk due to shared underlying risk factors. Spectral clustering is superior to other clustering methods across varied scenarios, 18 but the number of clusters to be identified by the algorithm is specified by the user. Preliminary data analyses suggested that six clusters contained discrete groups of very low and very high-risk children. Sensitivity analyses in which the final XGBoost models remained constant, but the spectral clustering algorithm generated four, five, six, seven and eight clusters were conducted.
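The clustering step operates on each child's vector of SHAP values rather than on the raw variables. To make concrete what such a vector contains, the sketch below computes exact Shapley values by brute force for a three-variable toy model. It is an illustration only, not the computation used in the study: the toy risk function, its coefficients, the baseline values, and the function names are all hypothetical, and the brute-force enumeration is feasible only for a handful of variables.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one observation (brute force).

    predict  : function mapping a list of feature values to a number
    x        : feature values of the child being explained
    baseline : reference values used when a feature is treated as absent
    """
    n = len(x)

    def value(subset):
        # Features in `subset` take the child's values, the rest the baseline.
        mixed = [x[k] if k in subset else baseline[k] for k in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [k for k in range(n) if k != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(features):
    """Hypothetical risk score with an oedema-by-urea interaction."""
    muac, urea, oedema = features
    return -0.3 * muac + 0.5 * urea + 1.0 * oedema * urea


child = [11.0, 8.0, 1]      # hypothetical child: low MUAC, high urea, oedema
reference = [13.0, 4.0, 0]  # hypothetical reference child
phi = shapley_values(toy_risk, child, reference)
print(phi)
print(sum(phi), toy_risk(child) - toy_risk(reference))
```

For this toy child the contributions are approximately 0.6 (MUAC), 4.0 (urea) and 6.0 (oedema), and they sum to the difference between the child's predicted score and the reference score. This additive property is what allows children with similar total risk but different contributing variables to be separated into different clusters.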
The number of children, the Kaplan-Meier estimated cumulative mortality, and median time-to-death for each cluster were estimated during the observed time period. Finally, the distributions of the most influential predictors of mortality across these clusters were described using Cohen's D values. Cohen's D is the difference between the mean value of variable x in Cluster i and the mean value of variable x in all other clusters, divided by the standard deviation of variable x in the whole sample. These results are the mean difference between Cluster i and all other Clusters expressed as standard deviations (SD). Six common clinical variables were added to these influential predictors to aid interpretation of the clusters (HIV status, malaria rapid test result, presence of oedema, consciousness level, caregiver reported diarrhoea, diagnosis of sepsis (per clinician), left hospital against medical advice).
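The cluster-versus-rest effect size described above can be written in a few lines. The sketch below is a minimal version under stated assumptions: the function name and toy data are hypothetical, and the sample standard deviation (n − 1 denominator) is assumed because the text does not specify which estimator was used.

```python
from statistics import mean, stdev


def cluster_effect_size(values, labels, cluster):
    """Mean of one cluster minus mean of all others, in whole-sample SD units."""
    inside = [v for v, c in zip(values, labels) if c == cluster]
    outside = [v for v, c in zip(values, labels) if c != cluster]
    if not inside or not outside:
        raise ValueError("both the cluster and its complement must be non-empty")
    # Assumption: sample SD (n - 1 denominator) over every child.
    sd = stdev(values)
    if sd == 0:
        raise ValueError("variable has no variation")
    return (mean(inside) - mean(outside)) / sd


# Toy MUAC values (cm) for four hypothetical children in two clusters.
muac = [10.0, 12.0, 14.0, 16.0]
cluster = ["A", "A", "B", "B"]
print(cluster_effect_size(muac, cluster, "A"))
print(cluster_effect_size(muac, cluster, "B"))
```

Because the denominator is the standard deviation of the whole sample rather than a pooled within-group value, the results for a two-cluster example are equal in size and opposite in sign, and values for different clusters are directly comparable on the same scale.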
The XGBoost model, SHAP values, and spectral clustering analyses were conducted in Python (v3.6.10); descriptive statistics were computed in R (v3.6.2, R Foundation for Statistical Computing). Additional rationale for the above analytic choices is available in Appendix 1.

Role of the funding source
The funder of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report. Four authors (KDT, NN, SF, GG) had access to the data and were responsible for the decision to submit for publication.

Results
The CHAIN cohort included 3101 children. The median age was 10.8 months (inter-quartile range [IQR]: 6.8, 15.7). Characteristics at admission to hospital are provided in Table 1, and discharge characteristics are in Appendix 3, Table S1. There were 350 deaths (11%) and 116 children (3.7%) were lost to follow-up.

30-day mortality
There were 234 (7.5%) deaths in the 30 days following admission. The median time-to-death was 5 days (IQR: 1, 11 days). The median C-statistic (AUC) of the 30-day models was 0.80 (IQR: 0.76, 0.83). The 25 variables providing the most information to the model are displayed in Fig. 2. MUAC was the strongest predictor of death. Twelve of the most influential 25 predictors were biochemical or haematological parameters. All four anthropometric measures (MUAC, WHZ, WAZ, LAZ) as well as younger patient age, lower heart rate, higher respiratory rate, slower capillary refill time, lower temperature, inability to drink or feed, greater distance to the nearest hospital, and residing in a low population density area were in the most influential 25 predictors of death. Two clinical diagnoses/syndromes were among the most influential 50 predictors of mortality (Appendix 3, Fig. S1); clinician diagnosed sepsis increased the predicted risk of death while caregiver reported diarrhoea reduced the predicted risk. Some clinical syndromes are included in the description of 30-day mortality clusters below.
Among the six 30-day mortality clusters (Table 2, Fig. 3), 1915 children (61%) fell within the two lowest mortality clusters. Cluster one included 727 children (23%) and six deaths (1% cumulative mortality) with a median of 0 days between enrolment and death (IQR 0-0 days). This lowest mortality cluster had a mean MUAC that was 1.6 SD higher than the mean of the other clusters (Fig. 4). Similarly, WAZ (1.5 SD), HAZ (1.0 SD), and WHZ (1.4 SD) were higher in cluster one than the other clusters. Cluster two included 1188 children (38%), and thirty deaths (3% cumulative mortality) with a median time to death of four days (IQR: 1-9). This second lowest mortality cluster had no features that were greater than 0.5 SD from the mean of other clusters.
Sensitivity analyses in which the clustering algorithm returned four, five, seven and eight clusters are detailed in Appendix 3, Fig. S2. The composition of the 30-day mortality clusters identified using six clusters was remarkably similar across these analyses, with the two lowest risk clusters remaining largely unchanged. Similarly, a cluster defined by higher serum urea and creatinine and one or more clusters of children with nutritional oedema and lower serum albumin were shared across the sensitivity analyses.

Post-discharge mortality
Of the 2874 children discharged alive, 166 (6%) died in the six months following hospital discharge. The median time to death was 45 days (IQR: 15-98 days). The median C-statistic (AUC) of the post-discharge XGBoost models was 0.74 (IQR: 0.68, 0.78). The 25 most informative variables were dominated by anthropometric, haematological, and biochemical parameters (Fig. 2). Low WAZ was the strongest predictor of mortality. Increased length of stay was also an important predictor, as were younger age, male sex, and having a lower heart rate at discharge. Caregiver anthropometric measures were also in the most influential 25 predictors. No WHO/IMCI syndromes associated with the index hospitalisation were among the most influential predictors, but we do include WHO/IMCI syndromes in the description of post-discharge mortality clusters below.
Two low mortality clusters were identified (

Figure caption: Each participant has one dot on each variable line, coloured by the value of that variable: pink for a high value, blue for a low value, grey for a missing value. The dots are positioned along the x-axis according to the contribution of that variable to the child's predicted risk: dots on the left indicate that the variable lowered the predicted risk, and dots on the right that it increased the risk. For example, the pink dots on the left side of the MUAC variable in the 30-day model indicate that high MUAC was associated with lower risk of death. Abbreviations: adm, admission; Alk. phosphatase, alkaline phosphatase; ALT, alanine transaminase; disch., discharge; HAZ, height-for-age z-score; hosp., hospital; ICU, intensive care unit admission; Inorg. phosphate, inorganic phosphate; MUAC, mid-upper arm circumference; WAZ, weight-for-age z-score; WLZ, weight-for-height z-score.

cluster was 56 (IQR: 23, 100). This cluster was defined by a longer length of stay in hospital (1.1 SD) but without clear differences in anthropometric status or clinical signs at admission or discharge.
Clusters E and F included 267 (9%) of the children and had extremely high mortality rates with 65 (30% cumulative mortality) and 24 (46% cumulative mortality) deaths respectively. Cluster E contained 215 (7%) children, with a median of 30 (IQR 8, 70) days to death. Children in this cluster more often left against medical advice (1.1 SD) and had a reduced consciousness level at discharge (2.0 SD). Cluster E was also defined by lower anthropometry at discharge (−1.2 SD MUAC, −1.8 SD WLZ, −1.4 SD WAZ, −0.6 SD HAZ). Cluster F included 52 (2%) children with a median of 36 (IQR 18, 91) days to death. These children were characterised by very low anthropometry (−2.4 SD MUAC, −1.3 SD WLZ, −2.4 SD WAZ, −2.4 SD HAZ) and signs of illness at discharge, including higher respiratory rates (0.7 SD), higher creatinine (0.9 SD), and higher urea (0.6 SD). Children in cluster F were also younger than other clusters (−0.8 SD).
Sensitivity analyses varying the number of post-discharge clusters were less consistent than those for the 30-day model (Appendix 3, Fig. S7). Most of these sensitivity analyses included one or more low risk clusters of children with higher anthropometry and shorter hospital stay, and at least one high risk cluster of children with wasting and either reduced consciousness or high creatinine. A cluster defined by lower red blood cells and positive malaria tests was identified in six, seven, and eight cluster variations. The eight cluster analysis split cluster D, defined by longer hospital stay without a clear anthropometric or clinical pattern, into three different clusters: one defined by long stays, one by low lymphocyte counts, and one by raised alkaline phosphatase.

Simplification and models' performance
We repeated the train and test pipeline with a decreasing number of variables, beginning with the 50 most informative predictors and iteratively dropping the least informative variables (Appendix 3, Fig. S9a). In the 30-day dataset, there were modest declines in C-statistic between models using 50 variables and those using 10 variables. Fewer than 10 variables substantially decreased 30-day performance. The post-discharge mortality mean C-statistic was maintained between 0.73 and 0.80 in models including 8-50 variables, but model performance became substantially weaker with fewer than eight variables (Appendix 3, Fig. S9b).
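The simplification procedure can be sketched as a short loop. This is a minimal illustration under stated assumptions rather than the study's pipeline: the function names are hypothetical, `score_fn` stands in for refitting the model and estimating the C-statistic on held-out data, the importance ranking is assumed to be fixed from the full model rather than recomputed after each removal, and the tolerance of 0.02 is an arbitrary value chosen for the example.

```python
def backward_elimination(ranked_features, score_fn, min_features=1):
    """Drop the least informative feature one at a time and re-score.

    ranked_features : names ordered from most to least informative
    score_fn        : callable taking a list of names and returning a
                      performance score (hypothetical stand-in for
                      refitting the model and computing a C-statistic)
    Returns a list of (number_of_features, score) pairs.
    """
    features = list(ranked_features)
    path = []
    while len(features) >= max(min_features, 1):
        path.append((len(features), score_fn(features)))
        features = features[:-1]  # remove the least informative remaining one
    return path


def smallest_adequate_model(path, tolerance=0.02):
    """Fewest features whose score is within `tolerance` of the best score."""
    best = max(score for _, score in path)
    return min(n for n, score in path if score >= best - tolerance)


def toy_score(features):
    """Hypothetical score with diminishing returns per added feature."""
    return 0.5 + 0.3 * (1 - 0.5 ** len(features))


ranking = ["MUAC", "urea", "creatinine", "platelets", "age"]
path = backward_elimination(ranking, toy_score)
for n_features, score in path:
    print(n_features, round(score, 3))
print(smallest_adequate_model(path))
```

Plotting the resulting scores against the number of variables gives a curve of the kind reported in Appendix 3, Fig. S9, and the second function makes explicit one possible rule for choosing where performance starts to fall away.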

Discussion
Risk phenotypes among children under two years of age hospitalised with acute illnesses were defined by a discrete set of anthropometric, clinical and laboratory features. These machine-generated clusters suggest that a large proportion of children were at low risk of both 30-day and post-discharge mortality, while much smaller clusters of children were at very high risk of death. Echoing our investigator-driven analysis, 5 these models suggest that syndrome-agnostic risk stratification at admission and discharge could be used to redistribute resources toward high-risk groups. Among lower risk children, risk stratification may limit the use of unnecessary interventions, thus reducing nosocomial infection, 19 antimicrobial resistance, and catastrophic household medical expenses. 20
Both models selected laboratory variables to be among the best predictors of mortality, including red blood cell, platelet and lymphocyte counts, in addition to creatinine, urea, and albumin levels. Risk-based management algorithms would benefit from inclusion of these parameters, but consistent provision of clinical laboratory tests is limited by logistic and financial barriers. 21,22 Uptake of these simple laboratory tests into current guidelines is hampered by stock-outs, poor quality control, and barriers to the timely return of results to the treating clinicians. However, large investments in new technologies and laboratory services for HIV, malaria and tuberculosis have helped 84% of people living with HIV to know their status, 23 revolutionised management guidelines for fever, 24 and facilitated a 20% increase in testing for drug resistance among patients with tuberculosis between 2017 and 2019. 25 Similar investments in routine laboratory testing could have a profound impact on the inpatient and post-discharge management of paediatric illness.
Table footnote: IQR, interquartile range. a: Kaplan-Meier estimated cumulative mortality proportion, where 1.0 would indicate 100% mortality in the observed period and 0.5 would indicate 50% mortality.
Anthropometric indices are important prognostic measures, 26,27 and causal models built using this dataset show that anthropometry captures a broad range of pathways to mortality. 5 Children with nutritional oedema were identified by our model as a unique mortality cluster. Nutritional oedema, or kwashiorkor, is a condition characterised by bipedal oedema and is associated with childhood wasting, but its aetiology is unclear. Children with nutritional oedema have discernibly different protein, lipid and glucose metabolism in comparison to children with wasting. 28 These children also have a reduced capacity to handle oxidative stress, and specific faecal microbiome changes. 28 Collectively, these biological differences lend credibility to our observation that nutritional oedema should be considered a unique risk phenotype.
Specific WHO/IMCI clinical syndromes were not important predictors of risk in our models. All enrolled children were acutely unwell, and these syndromes would be associated with mortality if they were compared to healthy children. Children included in this study were managed at referral facilities where uncomplicated diarrhoea, malaria and pneumonia are managed very effectively. 29 The majority of deaths occurred among children with severe complications (e.g., renal insufficiency) or comorbidities (e.g., wasting). Incremental changes to WHO/IMCI syndromic guidelines may have a limited impact on inpatient and post-discharge mortality, and larger reductions may require a focus on the identification and management of cases with complications or comorbidities.
In predictive modelling, the simplest model that approximates the best predictive performance is usually preferred. 6 We developed a complex model, with access to hundreds of variables, but found models with 8-10 variables largely replicated predictive performance. Our models were only modestly better than previously developed algorithms using simpler techniques. 2,11,27,30 This suggests that the small increase in predictive performance that a bedside artificial intelligence application might yield over a simple clinical algorithm may not justify the cost of implementing such a system. It is unlikely that there is an undiscovered combination of clinical signs and symptoms at admission or discharge that will predict mortality with extremely high accuracy. Developing algorithms based on changes in clinical status over fixed time periods, e.g., response to 24 or 48 h of treatment, may be more useful to clinicians than attempts to rearrange signs at a single timepoint. 30
The CHAIN cohort was a multi-country study that achieved harmonised and systematic data collection with very high retention among a large and heterogeneous population of children. However, our analysis also has limitations. We cannot make causal inferences, and any observed associations should be tested in external datasets using a causal inference framework where confounders such as site can be included in the model. Variables in our models are conditional on all the other variables within the model. For example, a pneumonia diagnosis is not highly ranked in our model because mediators, such as low pulse oximetry and high heart rate, were also in the model. We were not able to include variables related to socioeconomic status of children in the 30-day model; this limits the 30-day model to a biomedical view of acute mortality and omits the social context of those deaths.
There are laboratory tests, or other clinical variables, that may be more informative than those available in this dataset, such as C-reactive protein. These data were also collected at research facilities with active monitoring of syndrome-specific guideline adherence, which may limit generalisability, and may explain the weak associations between syndromes and mortality. Finally, despite achieving low loss to follow-up, it is still possible that the unknown outcomes of these children could bias the results, if they were disproportionately more likely to die.
Figure abbreviations: Alk. phosphatase, alkaline phosphatase; ALT, alanine transaminase; disch, discharge; HAZ, height-for-age z-score; HIV, human immunodeficiency virus; hosp., hospital; Inorg. phosphate, inorganic phosphate; LAMA, leaving against medical advice; MUAC, mid-upper arm circumference; RDT, rapid diagnostic test; WAZ, weight-for-age z-score; WLZ, weight-for-height z-score.
In conclusion, novel clusters of children with acute illnesses at both high and low risk of mortality were defined by cross-cutting risk factors, including lower anthropometry, low red cell counts, low white cell counts, high platelet counts, hypoalbuminemia, and biochemical markers of renal insufficiency. WHO/IMCI syndromes did not substantially contribute to risk prediction, suggesting that severity of illness, rather than cause, is more important in predicting outcome. However, these complex models only offered modest improvements in accuracy compared with simpler tools using a limited subset of variables. This suggests that investing in artificial intelligence algorithms to improve LMIC paediatric management may prove less effective than expanding access to reliable bedside or laboratory assessment. Incorporating these laboratory measures into clinical guidelines may help target life-saving resources at children at highest risk of death, which may lower mortality during and after acute childhood illnesses.
KDT, NN, GG and SF all had access to the data and verified the results presented.

Declaration of interests
All authors declare no competing interests.